USING STEIN'S ESTIMATOR TO PREDICT UNIVERSE SCORES
FROM OBTAINED SCORES


The purpose of this paper is to introduce and apply a recently
developed statistical method for estimating true (population) scores
from observed (sample) scores. Provided that at least three scores
are available, this method overall will give more accurate true score
estimates than the individual maximum likelihood estimates (MLE),
regardless of the true abilities of the examinees (Efron & Morris,
1977). The method can be used without knowledge of the (Bayesian)
prior distribution, and normality of the true scores being estimated
need not be assumed. The theoretical and practical implications of
the method extend beyond psychological measurement to the very
foundations of statistical inference and have caused some tumult in
that discipline during the past decade.


HISTORICAL  OVERVIEW 

For the Gaussian distribution, the average is the best estimator
of the true mean, $\theta$. The average is said to be "unbiased" because
no single value of $\theta$ is favored over any other value. That is, the
expected value of the average, $\bar{x}$, equals the true value of $\theta$,
regardless of the value of $\theta$. How many unbiased estimates of $\theta$
are there? An infinite number. But none of them estimates $\theta$
perfectly. The expected squared error of estimation for the average is
lower than that for any other linear or nonlinear and unbiased function
of the data.

A departure from this classical approach assumes that unbiased
estimates of $\theta$ are not the only methods by which to infer population
values. For example, other possible estimates of $\theta$ could be the
median, $\bar{x}/2$, $2\bar{x}$, the mode, etc. All such estimators can be
compared through a risk function, which is the expected value of the
squared error for every possible value of $\theta$. Plots of risk functions
show that there is no estimator with a risk function that is everywhere
lower than the risk function of the average, $\bar{x}$, provided that a
single mean is being estimated. But in the more general case, a score is
available from each of many examinees who have taken a test, for example,
and it is the true score of each examinee that is to be inferred. Thus,
the MLE is merely a specific case of the more general situation where
the mean scores ($\theta$'s) are sought for each examinee.
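As a rough illustration of the risk comparison just described, the following Python sketch evaluates the risk of estimators of the form $c\bar{x}$ in closed form, assuming $\bar{x}$ is normal with mean $\theta$ and variance $\sigma^2/n$. The function name `risk_scaled_mean` and the grid of $\theta$ values are illustrative assumptions and do not come from the original analysis.

```python
def risk_scaled_mean(c, theta, sigma2=1.0, n=1):
    """Expected squared error of the estimator c * xbar when the true mean is theta.

    Variance term: c^2 * sigma2 / n.  Squared-bias term: ((c - 1) * theta)^2.
    """
    return c * c * sigma2 / n + ((c - 1.0) * theta) ** 2


# Tabulate the risk of xbar/2, xbar, and 2*xbar over a few true means.
risk_table = {
    theta: {c: risk_scaled_mean(c, theta) for c in (0.5, 1.0, 2.0)}
    for theta in (0.0, 1.0, 2.0, 4.0)
}
for theta, risks in risk_table.items():
    print(theta, risks)
```

Under these assumptions, $\bar{x}/2$ has lower risk than $\bar{x}$ only when $\theta$ is close to zero, and $2\bar{x}$ is worse throughout, which is consistent with the statement that no competitor is lower everywhere when a single mean is estimated.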

Theoretical work conducted by Stein (1955) and by James and Stein
(1961) concentrated on estimating several unknown means, through methods
other than maximum likelihood estimation. The authors assumed that the
means are independent of each other and that the goodness of various
estimators can be assessed by a risk function: the sum of the expected
values of the squared errors of estimation for all of the individual
means. Also, it is not necessary to assume that the means being
estimated come from normal distributions. What James and Stein proved is
that when three or more means ($\theta$ values) are being estimated, it is
a less than optimal solution ("inadmissible") to estimate each $\theta$ from
its own average. That is, estimation rules can be found with smaller total
risk regardless of the values of the true means ($\theta$'s) for each
examinee. As Efron and Morris (1975) express this accomplishment:

Charles Stein showed that it is possible to make a uniform
improvement on the maximum likelihood estimate (MLE) in
terms of total squared error risk when estimating several
parameters. . . . This achievement leads immediately to a
uniform, nontrivial improvement over the least squares
(Gauss-Markov) estimators for the parameters in the usual
formulation of the linear model (p. 311).


THE  STEIN  ESTIMATOR 


The following discussion serves as an introduction to the Stein
estimator. Assume that we have k parameters $\theta_1, \theta_2, \ldots, \theta_k$,
$k \ge 3$, and that for each $\theta_i$ we observe an independent normal
variate $x_i$ with mean $E_{\theta_i} x_i = \theta_i$ and variance
$\mathrm{Var}_{\theta_i}(x_i) = 1$. Note that each $x_i$ might be the mean of
n independent observations $Y_{ij} \sim N(\theta_i, \sigma^2)$. Then
$x_i \sim N(\theta_i, \sigma^2/n)$, and a change of scale transforms
$\sigma^2/n$ to the more convenient value of 1. Therefore, the above
assumptions often occur as a reduction from more complicated situations
to this canonical form.


The primary objective for applying the set of estimation rules is
to estimate the unknown vector of means $\theta$,
$\theta = (\theta_1, \theta_2, \ldots, \theta_k)$. The performance of an
estimation rule is assessed by computing the sum of squared component
errors, that is, the squared error loss for that estimation rule. If
$\delta = (\delta_1, \ldots, \delta_k)$ is an estimation rule, where
$\delta_i$ is the estimate of $\theta_i$, then the squared error loss
$L(\theta, \delta)$ is defined as

$$L(\theta, \delta) = \sum_{i=1}^{k} (\delta_i - \theta_i)^2.$$

In the case of the maximum likelihood estimator, or the sample
mean, denoted by $\delta^0(X)$,
$\delta^0(X) = (\delta^0_1(X), \delta^0_2(X), \ldots, \delta^0_k(X)) = (x_1, \ldots, x_k)$,
there is a constant risk, R, with

$$R(\theta, \delta^0(X)) = R(\theta, X) = E_\theta \sum_{i=1}^{k} (x_i - \theta_i)^2 = k.$$

(Note that $E_\theta$ indicates the expectation over the distribution
$x_i \mid \theta_i \sim N(\theta_i, 1)$ introduced above. Observe that
$E_\theta (x_i - \theta_i)^2 = 1$ for each i.)
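The constant risk of the sample mean can be checked by simulation. The sketch below is a minimal Monte Carlo illustration under the canonical form (unit-variance normal observations); the helper names `squared_error_loss` and `mle_risk_mc`, the seed, and the example vector of true means are assumptions made only for illustration.

```python
import random


def squared_error_loss(theta, estimate):
    """Sum of squared component errors between an estimate and the true means."""
    return sum((d - t) ** 2 for d, t in zip(estimate, theta))


def mle_risk_mc(theta, n_rep=20000, seed=0):
    """Monte Carlo estimate of the risk of the MLE (each x_i estimates its own theta_i)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rep):
        x = [rng.gauss(t, 1.0) for t in theta]  # x_i | theta_i ~ N(theta_i, 1)
        total += squared_error_loss(theta, x)
    return total / n_rep


# Hypothetical true means; the estimated risk should be close to k = 5
# whatever values are chosen here.
example_theta = [-2.0, 0.0, 1.5, 3.0, 10.0]
print(mle_risk_mc(example_theta))
```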


The Stein estimator may be used to estimate $\theta$. Define the Stein
estimator, $\delta^1(X) = (\delta^1_1(X), \delta^1_2(X), \ldots, \delta^1_k(X))$,
$k \ge 3$, as follows:

$$\delta^1_i(X) = \mu_i + \left(1 - \frac{k-2}{S}\right)(x_i - \mu_i),$$

where $\mu = (\mu_1, \ldots, \mu_k)$ represents an initial guess at the true
mean, $\theta$, and S is defined by $S = \sum_{i=1}^{k} (x_i - \mu_i)^2$.
This estimator thus has risk

$$R(\theta, \delta^1(X)) = E_\theta \sum_{i=1}^{k} (\delta^1_i(X) - \theta_i)^2 \le k - \frac{(k-2)^2}{k - 2 + \sum (\theta_i - \mu_i)^2}$$

for all $\theta$. If $\theta_i = \mu_i$ for all i, the risk is 2, which
compares quite favorably to k obtained for the sample mean. In any event,
the risk for the Stein estimator is less than that for the maximum
likelihood estimator. A discussion of how the risk for the Stein estimator
was obtained is presented in the last section of this paper.
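A minimal Python sketch of the Stein rule with a fixed initial guess may make the definition concrete. It assumes the canonical form (unit variances); the names `stein_estimate` and `stein_risk_mc`, the seed, and the simulated settings are illustrative assumptions rather than part of the original study.

```python
import random


def stein_estimate(x, mu):
    """Shrink each x_i toward its initial guess mu_i by the factor 1 - (k - 2) / S."""
    k = len(x)
    if k < 3:
        raise ValueError("the Stein rule requires k >= 3 components")
    s = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    shrink = 1.0 - (k - 2) / s
    return [mi + shrink * (xi - mi) for xi, mi in zip(x, mu)]


def stein_risk_mc(theta, mu, n_rep=20000, seed=1):
    """Monte Carlo estimate of the total squared error risk of the Stein rule."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rep):
        x = [rng.gauss(t, 1.0) for t in theta]
        est = stein_estimate(x, mu)
        total += sum((e - t) ** 2 for e, t in zip(est, theta))
    return total / n_rep


k = 10
print(stein_risk_mc([0.0] * k, [0.0] * k))  # guess equals truth: risk near 2
print(stein_risk_mc([3.0] * k, [0.0] * k))  # poor guess: risk still below k
```

With the guess equal to the truth the simulated risk should sit near 2, and with a poor guess it should approach but stay below k, in line with the bound stated above.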


The Stein estimator has a very natural interpretation in an
empirical Bayes context. If the $\theta_i$ themselves are a sample from a
prior distribution, $\theta_i \sim N(\mu_i, \tau^2)$, $i = 1, \ldots, k$,
then the Bayes estimate of $\theta_i$ is the a posteriori mean of
$\theta_i$ given the data, and defined by

$$\delta^*_i(x_i) = \mu_i + \left(1 - \frac{1}{1 + \tau^2}\right)(x_i - \mu_i).$$

In the empirical Bayes situation, $\tau^2$ is unknown, but it can be
estimated because marginally the $x_i$ are independently normal with means
$\mu_i$ and $S = \sum_{i=1}^{k} (x_i - \mu_i)^2 \sim (1 + \tau^2)\chi^2_k$,
where $\chi^2_k$ is a chi-square distribution with k degrees of freedom.
Given that $k \ge 3$, the unbiased estimate

$$E\left[\frac{k-2}{S}\right] = \frac{1}{1 + \tau^2}$$

is available.

Substituting $\frac{k-2}{S}$ for the unknown $\frac{1}{1 + \tau^2}$ in the
Bayes estimate $\delta^*_i$ results in
$\mu_i + \left(1 - \frac{k-2}{S}\right)(x_i - \mu_i)$, which is the Stein
estimator.
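The unbiasedness claim for $(k-2)/S$ can be checked numerically. The following sketch simulates the marginal distribution of the observations, in which each $x_i - \mu_i$ is normal with variance $1 + \tau^2$, and averages $(k-2)/S$ over many replications. The function name `mean_shrinkage_estimate`, the seed, and the values of k and $\tau^2$ are assumptions chosen for illustration.

```python
import random


def mean_shrinkage_estimate(k, tau2, n_rep=20000, seed=2):
    """Average of (k - 2) / S when x_i - mu_i ~ N(0, 1 + tau2) independently."""
    rng = random.Random(seed)
    sd = (1.0 + tau2) ** 0.5
    total = 0.0
    for _ in range(n_rep):
        # S is the sum of squared deviations from the prior means.
        s = sum(rng.gauss(0.0, sd) ** 2 for _ in range(k))
        total += (k - 2) / s
    return total / n_rep


# With tau2 = 1 the target value 1 / (1 + tau2) is 0.5.
print(mean_shrinkage_estimate(k=12, tau2=1.0))
```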

Predicting  Batting  Averages  Using  the  Stein  Estimator 

The following example is adapted from Efron and Morris (1975,
1977). Batting averages for major league baseball players, based upon
their first 45 times at bat, were obtained. The objective was to
predict each player's batting average for the remainder of the season. A
cutoff after the first 45 times at bat was chosen because that number
was large enough to ensure a satisfactory approximation to the binomial
distribution by the normal distribution and because the vast majority
of "at bats" for the season would be estimated. The model assumes that
hits occur according to a binomial distribution with independence
between players. (Requiring the same number of trials for all players,
n = 45, assures equal variances; however, the Stein estimator can also
be used when variances are unequal. See Efron and Morris, 1975.)


Let $Y_i$ be the batting average of player i, $i = 1, \ldots, k$ (k = 12),
after the first 45 times at bat. Assume that
$nY_i \overset{\text{ind}}{\sim} \mathrm{Bin}(n, p_i)$, $i = 1, \ldots, 12$,
where $p_i$ is the true season batting average, i.e., $E(Y_i) = p_i$.

Because the variance $\sigma^2$ depends upon the mean, the arc-sin
transformation for stabilizing the variance of a binomial distribution
is applied: $x_i = f_{45}(Y_i)$, where
$f_n(y) = n^{1/2} \arcsin(2y - 1)$. It can be shown that this
transformation results in $x_i$ having nearly unit variance independent
of $p_i$.
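The transformation and its inverse are easy to express in code. The sketch below is a minimal Python version; the function names `arcsin_transform` and `inverse_arcsin_transform` are hypothetical, and the inverse is only meaningful when the transformed value lies within $\pm n^{1/2}\pi/2$.

```python
import math


def arcsin_transform(y, n):
    """Variance-stabilizing transform f_n(y) = sqrt(n) * arcsin(2y - 1)."""
    return math.sqrt(n) * math.asin(2.0 * y - 1.0)


def inverse_arcsin_transform(x, n):
    """Map a transformed value back to the proportion scale."""
    return (math.sin(x / math.sqrt(n)) + 1.0) / 2.0


# A batting average of .400 after n = 45 at bats.
x_first = arcsin_transform(0.400, 45)
print(round(x_first, 2))
print(round(inverse_arcsin_transform(x_first, 45), 3))
```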


The mean $\theta_i$ of $x_i$ is given by $\theta_i = f_n(p_i)$. Values of
$Y_i$, $p_i$, $x_i$, $\theta_i$, $\delta^1_i$, and $\hat{p}^1_i$ are listed
for players 1 through 12 in Table 1. Batting averages for the first 45
times at bat are listed in column 1. Each player received from 270 to 590
additional "trials" during the season. The batting averages for this
seasonal trial number are listed in column 2. Recall that the objective
here is to predict each player's column 2 ("true," "population") value
using the initially obtained column 1 ("sample") value.

Table 1

Example Using Batting Averages From 12 Players

           (1)      (2)      (3)       (4)          (5)            (6)
Player    $Y_i$    $p_i$    $x_i$   $\theta_i$  $\delta^1_i$  $\hat{p}^1_i$

   1      .400     .346    -1.35     -2.10        -2.49          .319
   2      .378     .298    -1.66     -2.79        -2.60          .311
   3      .356     .276    -1.97     -3.11        -2.71          .303
   4      .333     .222    -2.28     -3.96        -2.82          .296
   5      .311     .270    -2.60     -3.20        -2.93          .297
   6      .289     .263    -2.92     -3.32        -3.03          .283
   7      .244     .269    -3.60     -3.23        -3.26          .265
   8      .222     .303    -3.95     -2.71        -3.40          .258
   9      .222     .264    -3.95     -3.30        -3.40          .258
  10      .222     .226    -3.95     -3.89        -3.40          .258
  11      .200     .285    -4.32     -2.98        -3.53          .249
  12      .178     .316    -4.70     -2.53        -3.66          .241

Note. Listing of the MLE scores and estimated universe scores (columns
1 and 2), score transformations (columns 3, 4, and 5), and the estimated
universe score from using Stein's estimator (column 6).

The $x_i$ values obtained upon application of the arc-sin
transformation to the column 1 batting averages (observed scores) are
shown in column 3. Similarly, the $\theta_i$ values obtained by applying
the arc-sin transformation to the column 2 batting averages are shown in
column 4. The Stein estimator values that estimate the $\theta_i$ are
shown in column 5, and the values obtained upon retransformation via the
arc-sin transformation are given in column 6. The following calculations
are examples of the type of computations required. Note that the
computations are not at all complex.

For i = 2, $f_n(Y_2) = 45^{1/2} \arcsin(2(.378) - 1) = -1.66$. Therefore,
$x_2 = -1.66$, and is entered in column 3. Similarly,
$\theta_2 = f_n(p_2) = 45^{1/2} \arcsin(2(.298) - 1)$. This value is given
in column 4. Values for $x_1, x_3, \ldots, x_{12}$ and
$\theta_1, \theta_3, \ldots, \theta_{12}$ are obtained through similar
substitutions.


The basic equation for the Stein estimator $\delta^1_i$, which allows us
to estimate the ith component of $\theta$, is slightly different from the
expression introduced previously. We estimate the initial guess $\mu$,
here taken as common to all components, by $\bar{X} = \sum x_i / k$, which
shrinks all $x_i$ toward $\bar{X}$. The resulting estimate of the ith
component of $\theta$ is given by

$$\delta^1_i(X) = \bar{X} + \left(1 - \frac{k-3}{V}\right)(x_i - \bar{X}),$$

where $V = \sum (x_i - \bar{X})^2$, and $k - 3 = (k - 1) - 2$, because one
parameter is estimated.


In the empirical Bayes case, the appropriateness of this formulation
follows if $\bar{X}$ is used as the unbiased estimate for $\mu$ and
$\frac{k-3}{V}$ as the unbiased estimate for $\frac{1}{1 + \tau^2}$.
Therefore, in the case of the example data provided in Table 1,

$$\bar{X} = \sum x_i / k = \frac{(-1.35) + \cdots + (-4.70)}{12} = -3.10.$$

The value for $\bar{X}$ may in turn be used to compute V:

$$V = \sum (x_i - \bar{X})^2 = (-1.35 - (-3.10))^2 + \cdots + (-4.70 - (-3.10))^2 = 13.81.$$

The Stein estimates for $\theta_1, \ldots, \theta_{12}$ are derived by
substituting the obtained values for $\bar{X}$ and V in the computational
equation:

$$\delta^1_i(X) = -3.10 + \left(1 - \frac{k-3}{V}\right)(x_i - (-3.10)) = .350x_i - 2.02.$$

For example, $\delta^1_1(X) = .350(-1.35) - 2.02 = -2.49$. This value and
the values for $\delta^1_2, \ldots, \delta^1_{12}$ are listed in column 5 of
Table 1. These values are finally retransformed to obtain the estimates of
the "true score" average for each player in column 6.
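The computation just described can be reproduced in a few lines of Python. The sketch below uses the column 3 values of Table 1 as printed; the function name `stein_toward_grand_mean` is hypothetical. Because the table entries are rounded to two decimals, the computed V and shrinkage factor may differ from the reported 13.81 and .350 in the second or third decimal place.

```python
# Column 3 of Table 1: arc-sin transformed batting averages after 45 at bats.
X = [-1.35, -1.66, -1.97, -2.28, -2.60, -2.92,
     -3.60, -3.95, -3.95, -3.95, -4.32, -4.70]


def stein_toward_grand_mean(x):
    """Shrink each x_i toward the grand mean by the factor 1 - (k - 3) / V."""
    k = len(x)
    if k < 4:
        raise ValueError("shrinking toward the grand mean requires k >= 4")
    xbar = sum(x) / k
    v = sum((xi - xbar) ** 2 for xi in x)
    shrink = 1.0 - (k - 3) / v
    estimates = [xbar + shrink * (xi - xbar) for xi in x]
    return xbar, v, shrink, estimates


xbar, v, shrink, stein = stein_toward_grand_mean(X)
print(round(xbar, 2), round(v, 2), round(shrink, 3))
print([round(d, 2) for d in stein])
```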

The total squared prediction error for $\delta^1(X)$ is defined as
$(\delta^1_1 - \theta_1)^2 + \cdots + (\delta^1_{12} - \theta_{12})^2$. This
value is obtained by subtracting the column 4 value from the column 5
value for each of the 12 players, squaring the differences, and summing.

In the case of the sample mean, X, the total squared prediction
error is defined as $\sum (x_i - \theta_i)^2 = 15.135$. This value is
obtained by subtracting the column 4 value from the column 3 value for
each of the 12 players, squaring the differences, and summing.
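The two totals can be recomputed directly from Table 1. The sketch below uses columns 3, 4, and 5 as printed; the list names and the helper `total_squared_error` are illustrative. Since the table entries are rounded, the totals are expected to agree with the reported figures only approximately.

```python
# Columns 3, 4, and 5 of Table 1 as printed.
X = [-1.35, -1.66, -1.97, -2.28, -2.60, -2.92,
     -3.60, -3.95, -3.95, -3.95, -4.32, -4.70]
THETA = [-2.10, -2.79, -3.11, -3.96, -3.20, -3.32,
         -3.23, -2.71, -3.30, -3.89, -2.98, -2.53]
STEIN = [-2.49, -2.60, -2.71, -2.82, -2.93, -3.03,
         -3.26, -3.40, -3.40, -3.40, -3.53, -3.66]


def total_squared_error(estimates, truth):
    """Sum over players of (estimate - true value)^2."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truth))


mle_total = total_squared_error(X, THETA)        # column 3 versus column 4
stein_total = total_squared_error(STEIN, THETA)  # column 5 versus column 4
ratio = mle_total / stein_total                  # how many times larger the MLE error is
print(round(mle_total, 3), round(stein_total, 3), round(ratio, 2))
```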

The adequacy of Stein's estimator relative to the sample mean may
be determined by computing their relative efficiency. The efficiency
of Stein's estimator relative to the sample mean is defined as

$$\frac{\sum (x_i - \theta_i)^2}{\sum (\delta^1_i(X) - \theta_i)^2}.$$

In this example, the efficiency is 3.746. In other words, Stein's
estimator is nearly four times as "efficient" in predicting "universe" or
"true" scores from observed (sample) scores as is the MLE.


Limited Translation Estimators


Stein estimators achieve uniformly lower aggregate risk than the
MLE (sample mean), as shown above, but may result in increased risk to
individual components of $\theta$. In particular, the Stein estimator may
do poorly in estimating $\theta_i$ with very large or very small values.
Therefore, even though $\delta^1(X)$ provides better prediction in the
aggregate, one may grossly err with individual components. A desirable
compromise would be to have both good aggregate and good individual
prediction, where improved individual prediction would occur with minimal,
if any, loss in aggregate prediction efficiency. This tradeoff may be
achieved by using "limited translation estimators" that reduce individual
risk for outlying cases and result in minimal loss in aggregate
prediction.


Limited translation estimators are introduced to reduce potentially
large mean squared prediction errors associated with individual
components. Shrinkage of $\delta^1_i$ values toward $x_i$ values is
accomplished through the estimate $\delta^s_i$, $0 \le s \le 1$, of
$\theta_i$. (Here, $\delta^0_i = x_i$, and $\delta^1_i$ is the Stein
estimate.) $\delta^s_i$ is defined to be as close to $\delta^1_i$ as
possible, so long as it does not differ by more than

$$\left[\frac{(k-1)(k-3)}{kV}\right]^{1/2} D_{k-1}(s)$$

standard deviations of $x_i$ from $x_i$. $D_{k-1}(s)$ is a constant,
obtained from a table of limited translation estimators (Efron & Morris,
1972).


Data from the baseball example will now be used to illustrate the
application of limited translation estimators. Notice in Table 1 that
the first player's season average far exceeds the season averages of
the remaining players, an example of an outlying case. In the baseball
example, k = 12, and V was found to be 13.81. Therefore, by obtaining
values for $D_{k-1}(.9)$ and $D_{k-1}(.8)$ from the Efron and Morris (1972)
table, it is found that $\delta^{.9}_i(X)$ may differ by no more than .75
from $x_i$ and $\delta^{.8}_i(X)$ may differ by no more than .56 from
$x_i$. In other words, applying $\delta^{.9}$ means that if
$|\delta^1_i - x_i| \le .75$, then $\delta^1_i$ is retained; but if
$|\delta^1_i - x_i| > .75$, $\delta^{.9}_i$ is set equal to the value
differing from $x_i$ by .75.
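The rule amounts to clamping the Stein estimate so that it stays within a fixed distance of the observed value. The sketch below assumes the maximum permitted shift (for example .75 or .56 on the transformed scale) has already been obtained from the published table; the function name `limited_translation` and its argument names are hypothetical.

```python
def limited_translation(x_i, stein_i, max_shift):
    """Keep the Stein estimate if it lies within max_shift of x_i; otherwise
    move only max_shift away from x_i, in the direction of the Stein estimate."""
    shift = stein_i - x_i
    if abs(shift) <= max_shift:
        return stein_i
    return x_i + max_shift if shift > 0 else x_i - max_shift


# First player in Table 1: x = -1.35, Stein estimate = -2.49.
print(limited_translation(-1.35, -2.49, 0.75))
print(limited_translation(-1.35, -2.49, 0.56))
# Fifth player: the Stein estimate is already within the limit and is retained.
print(limited_translation(-2.60, -2.93, 0.75))
```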


Table 2 contains values for the 12 players for $p_i$, $Y_i$,
$\hat{p}^{1}_i$, $\hat{p}^{.9}_i$, and $\hat{p}^{.8}_i$. Values for
$\hat{p}^{.9}_i$ and $\hat{p}^{.8}_i$ are obtained as follows. Consider
the first player, $x_1 = -1.35$, and $\delta^1_1 = -2.49$; therefore
$|x_1 - \delta^1_1| = 1.14 > .75$, so $\delta^{.9}_1 = -2.10$, and
$|x_1 - \delta^1_1| = 1.14 > .56$. Thus, $\delta^{.8}_1 = -1.91$. These
values are retranslated to obtain $\hat{p}^{.9}_1 = .346$, and
$\hat{p}^{.8}_1 = .360$. Notice that $p_1 = .346$. Therefore,
$\hat{p}^{.9}_1$ provides better prediction for this individual than
$\hat{p}^{1}_1$ or $\hat{p}^{.8}_1$. Also note that $\hat{p}^{.8}_1$ is
closer to the $p_1$ value than $\hat{p}^{1}_1$. All three prediction
estimates are closer than the MLE value of $Y_1 = .400$. In the case of
the second player, though, the $\hat{p}^{s}_2$ value became farther removed
from $p_2$ as the value of s decreases from 1 to .9 to .8. Therefore, the
translations are increasing the squared prediction error for that player
rather than decreasing it. In the case of the fifth individual,
$|\delta^1_5 - x_5| < .75$ and $|\delta^1_5 - x_5| < .56$, so the estimated
value remains the same under translations s = .9 and s = .8. The estimated
value will not change until the permitted difference falls below
$|\delta^1_5 - x_5|$. In this particular example, the translation is
increasing the error for many individual components by increasing the
difference between the estimate and the true score.


Recall that the efficiency of Stein's estimator, $\delta^1(X)$, relative
to the sample mean was defined to be

$$\frac{\sum (x_i - \theta_i)^2}{\sum (\delta^1_i(X) - \theta_i)^2} = 3.746.$$

The efficiency of the limited translation estimator $\delta^{.9}(X)$
relative to the sample mean is defined to be

$$\frac{\sum (x_i - \theta_i)^2}{\sum (\delta^{.9}_i(X) - \theta_i)^2},$$

which equals 3.077. Similarly, for $\delta^{.8}(X)$ the relative
efficiency equals 2.462. Therefore, in this example $\delta^1(X)$ has the
greatest efficiency of the three estimators, $\delta^1$, $\delta^{.9}$,
and $\delta^{.8}$.

Table 2

Batting Averages and Their Estimates

Columns: (A) $p_i$; (B) $Y_i$ ($x_i$); (C) $\hat{p}^{1}_i$ ($\delta^1_i$);
(D) $\hat{p}^{.9}_i$ ($\delta^{.9}_i$) a; (E) $\hat{p}^{.8}_i$
($\delta^{.8}_i$) b; (F) $|\delta^1_i - x_i|$

Player   (A)        (B)             (C)             (D)             (E)         (F)

   1     .346   .400 (-1.35)   .319 (-2.49)   .346 (-2.10)   .360 (-1.91)   1.14
   2     .298   .378 (-1.66)   .311 (-2.60)   .324 (-2.41)   .338 (-2.22)    .94
   3     .276   .356 (-1.97)   .303 (-2.71)   .303 (-2.71)   .316 (-2.53)    .74
   4     .222   .333 (-2.28)   .296 (-2.82)   .296 (-2.82)   .296 (-2.82)    .54
   5     .270   .311 (-2.60)   .288 (-2.93)   .288 (-2.93)   .288 (-2.93)    .33
   6     .263   .289 (-2.92)   .282 (-3.03)   .282 (-3.03)   .282 (-3.03)    .11
   7     .269   .244 (-3.60)   .265 (-3.28)   .265 (-3.28)   .265 (-3.28)    .32
   8     .303   .222 (-3.95)   .258 (-3.40)   .258 (-3.40)   .258 (-3.40)    .55
   9     .264   .222 (-3.95)   .258 (-3.40)   .258 (-3.40)   .258 (-3.40)    .55
  10     .226   .222 (-3.95)   .258 (-3.40)   .258 (-3.40)   .258 (-3.40)    .55
  11     .285   .200 (-4.32)   .249 (-3.53)   .246 (-3.57)   .234 (-3.76)    .79
  12     .316   .178 (-4.70)   .241 (-3.66)   .222 (-3.95)   .210 (-4.14)   1.04

a Maximum permitted difference from $x_i$:
$\left[\frac{(k-1)(k-3)}{kV}\right]^{1/2} D_{k-1}(.9) = .75$

b Maximum permitted difference from $x_i$:
$\left[\frac{(k-1)(k-3)}{kV}\right]^{1/2} D_{k-1}(.8) = .56$
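The three relative efficiencies can be recomputed from the transformed values in Tables 1 and 2. The sketch below is illustrative only: the names are hypothetical, the limits .75 and .56 are taken as given from the text, and because the tabled values are rounded the results are expected to match the reported 3.746, 3.077, and 2.462 only approximately.

```python
# Transformed observed scores, true scores, and Stein estimates (Tables 1 and 2).
X = [-1.35, -1.66, -1.97, -2.28, -2.60, -2.92,
     -3.60, -3.95, -3.95, -3.95, -4.32, -4.70]
THETA = [-2.10, -2.79, -3.11, -3.96, -3.20, -3.32,
         -3.23, -2.71, -3.30, -3.89, -2.98, -2.53]
STEIN = [-2.49, -2.60, -2.71, -2.82, -2.93, -3.03,
         -3.28, -3.40, -3.40, -3.40, -3.53, -3.66]
MAX_SHIFT = {0.9: 0.75, 0.8: 0.56}  # limits reported in the text for s = .9 and s = .8


def clamp_toward(x_i, stein_i, max_shift):
    """Limited translation: never move more than max_shift away from x_i."""
    shift = stein_i - x_i
    if abs(shift) <= max_shift:
        return stein_i
    return x_i + max_shift if shift > 0 else x_i - max_shift


def total_squared_error(estimates, truth):
    return sum((e - t) ** 2 for e, t in zip(estimates, truth))


def relative_efficiency(estimates, x, truth):
    """Total squared error of the sample mean divided by that of the estimator."""
    return total_squared_error(x, truth) / total_squared_error(estimates, truth)


efficiency = {1.0: relative_efficiency(STEIN, X, THETA)}
for s, bound in MAX_SHIFT.items():
    limited = [clamp_toward(xi, di, bound) for xi, di in zip(X, STEIN)]
    efficiency[s] = relative_efficiency(limited, X, THETA)

for s in sorted(efficiency, reverse=True):
    print(s, round(efficiency[s], 2))
```

The ordering of the three values is the point of interest: tightening the limit from s = 1 to .9 to .8 lowers the aggregate efficiency in this data set, while every estimator remains more efficient than the sample mean.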


Relationship Between Aggregate and Individual Component Mean
Squared Prediction Errors

Prior information about certain examinees can be used to produce
modified estimates of their true or universe scores. In this sense,
the estimator functions as an empirical Bayesian prediction model. This
procedure is most effectively used when the examiner has highly credible
information about specific examinees, which is tantamount to having a
high prior probability, in the usual Bayesian sense. As a result, for
these particular examinees, the fit of test scores to "true" scores may
be improved considerably by use of a limited translation estimator.
However, even though the limited translation estimator yields a lower
aggregate squared prediction error for the set of examinees as a whole
than does the MLE (sample mean), it may reduce the overall efficiency
from that of $\delta^1(X)$ by increasing the mean squared prediction
errors for other examinees in the population. Therefore, overall
efficiency, individual squared prediction error, and prior information
available on some examinees must all be considered simultaneously to
determine what translation, if any, is to be performed.

If there is uniform prior information about all examinees in the
score distribution, it may be best to maximize the aggregate efficiency.
If no information about true scores is available, it is impossible to
assess which individuals have the greatest squared prediction errors
associated with them. Therefore, a good strategy would be to achieve
maximal aggregate efficiency.

If prior information is concentrated at the extremes of the score
distribution, translations may be applied to bring the predicted score
more in line with the type of score that might be expected, based upon
prior information. In accomplishing this reduction, however, one must
evaluate its effect on aggregate efficiency. First, the individual
scores can be adjusted until they are in line with prior expectations,
and the resulting aggregate efficiency then evaluated. Or, one can
focus on attaining maximum aggregate efficiency and then notice how
the scores of examinees for whom prior information is available are
influenced by minor translations. A major decision is to determine
at what point score-fitting for particular examinees becomes
counterproductive or inefficient, because minimal additional improvements
are achieved at a high cost to the overall aggregate efficiency.

A case in point is when the "true" score does not fall between the
MLE and $\delta^1(X)$ but when $\delta^1(X)$ falls between the true score
and the sample mean. Shrinking the difference between the sample mean and
$\delta^1(X)$ by application of a limited translation estimator,
$\delta^s(X)$, actually increases the squared prediction error for that
examinee. The reasoning is the same when all prior information on an
examinee does not fall between the MLE (sample mean) and $\delta^1(X)$.

There are also several methodological considerations in relating
obtained and "true" score estimates. Initial trials may underestimate
a "true" score if the learning curve has not yet reached asymptote in
this number of test trials. Likewise, fatigue from the last group of
test items could produce an underestimate of the "true" score.

Many factors need to be considered in relating observed scores and
true scores, in applying limited translations, in optimizing good
individual and good aggregate prediction, and in using prior information
on specific examinees productively. The usefulness of Stein's estimator
in behavioral and educational research largely depends upon how well
these considerations are addressed.


SUMMARY 

The scientific implications and practical applications of the Stein
estimator approach for estimating true scores from observed scores are
of potentially great importance. The conceptual complexity is not much
greater than that required for more conventional regression models. The
empirical Bayesian aspect allows the examiner to incorporate his/her own
degree of prior information about selected examinees. This approach
allows for a more accurate estimation of true scores, with the corollary
of using fewer test items to achieve those true score estimates. Efron
and Morris (1975) make the point that "there is little penalty for using
the rules discussed here because they cannot give larger total mean
squared error than the MLE. ..." This assurance may be a sufficient reason
for more careful examination of the utility of the Stein estimator and its
limited translation estimators as they apply to behavioral and social
science research.